Game of Thrones Ratings Per Episode Per Season

⚠ Warning: This Website Contains Spoilers

1 Background and Research Question

“How have the ratings of Game of Thrones episodes evolved over time across different seasons?”

This question explores trends in the data, such as whether ratings rose or fell over the course of the show and whether there were any significant drops after key plot developments.

2 Data Source

The raw data was obtained from the IMDb website, which is publicly accessible via this link:

https://www.imdb.com/title/tt0944947/ratings/

The data used in this analysis was extracted from the ‘Ratings by episode’ section of the Game of Thrones page.

I created an Excel spreadsheet that replicates the grid shown on the IMDb page, with one rating per episode, making the data accessible for wrangling and visualisation.

3 Data Preparation

3.1 Package Versions

| Library and Version | Purpose |
|---|---|
| tidyverse_2.0.0 | for handling data |
| here_1.0.1 | for easy file and directory referencing |
| readxl_1.4.3 | for reading Excel files |
| knitr_1.49 | for combining R code with text to create dynamic reports |
| dplyr_1.1.4 | for data manipulation tasks |
| jpeg_0.10.10 | for reading JPEG files |
| gganimate_1.0.9 | for creating animated plots |
| plotly_4.10.4 | for an interactive plot |
| showtext_0.9.7 | for changing fonts |
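
The version numbers above can be checked against a local installation. The following is a minimal sketch, assuming the packages listed in the table are the ones of interest. The vector `pkgs` is a helper name introduced here for illustration, and packages that are not installed are skipped rather than raising an error.

```r
# Packages named in the table above (hypothetical helper vector)
pkgs <- c("tidyverse", "here", "readxl", "knitr", "dplyr",
          "jpeg", "gganimate", "plotly", "showtext")

# Keep only the packages that are installed on this machine
is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
installed <- pkgs[is_installed]

# Look up the installed version of each package as a character string
versions <- vapply(installed,
                   function(p) as.character(utils::packageVersion(p)),
                   character(1))

print(versions)
```

Calling `sessionInfo()` after loading the packages reports similar information for everything attached in the session.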

3.2 Load Packages

library(tidyverse)
library(here)
library(readxl)
library(knitr)
library(dplyr)
library(jpeg)
library(gganimate)
library(plotly)
library(showtext)

3.3 Import the Data

#load data from excel file
rawdata <- read_excel(here::here("raw_data", "raw_data.xlsx"))
## New names:
## • `` -> `...1`

The top-left cell of the spreadsheet is empty, so readxl has automatically named the first column “...1”; the empty rating cells are imported as NA. First, I am going to print the data to see how it looks after importing.

#this is a sanity check to inspect the data
print(rawdata)
## # A tibble: 8 × 12
##   ...1     e1    e2    e3    e4    e5    e6    e7    e8    e9   e10   e11
##   <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 s1      8.9   8.6   8.5   8.6   9     9.1   9.1   8.9   9.6   9.4   9.2
## 2 s2      8.6   8.3   8.7   8.6   8.6   8.9   8.8   8.6   9.7   9.3  NA  
## 3 s3      8.6   8.4   8.7   9.5   8.9   8.7   8.6   8.9   9.9   9.1  NA  
## 4 s4      9     9.7   8.7   8.7   8.6   9.7   9     9.7   9.6   9.7  NA  
## 5 s5      8.3   8.3   8.3   8.5   8.5   7.9   8.8   9.8   9.4   9.1  NA  
## 6 s6      8.4   9.2   8.6   9     9.7   8.3   8.5   8.3   9.9   9.9  NA  
## 7 s7      8.5   8.8   9.1   9.7   8.7   9     9.4  NA    NA    NA    NA  
## 8 s8      7.6   7.9   7.5   5.5   5.9   4    NA    NA    NA    NA    NA
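
The “...1” header in the printout is a placeholder name generated on import because the first header cell is empty. As a minimal sketch of one way to tidy such headers, the example below builds a two-row stand-in for the imported table (the object `raw` is illustrative, with values copied from the first two seasons above) and renames the columns by pattern rather than typing every name out.

```r
# Small stand-in for the imported sheet (first two seasons, three episodes)
raw <- data.frame(c("s1", "s2"), c(8.9, 8.6), c(8.6, 8.3), c(8.5, 8.7))
names(raw) <- c("...1", "e1", "e2", "e3")

# Give the placeholder column a meaningful name
names(raw)[names(raw) == "...1"] <- "Season"

# Turn "e1", "e2", ... into "Episode 1", "Episode 2", ...
names(raw) <- sub("^e([0-9]+)$", "Episode \\1", names(raw))

print(names(raw))
```

Renaming by pattern means the code does not need to change if the number of episode columns changes.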

The data has imported successfully. The next step is to wrangle it into a form that can be easily visualised.

4 Data Wrangling

4.1 Renaming the columns

#rename the first column after automatic assignment of "...1"
colnames(rawdata) <- ifelse(colnames(rawdata) == "...1", "Season", colnames(rawdata))

#rename the remaining columns, leaving out the first, which I have just renamed to "Season"
colnames(rawdata)[-1] <- c("Episode 1", "Episode 2", "Episode 3", "Episode 4", "Episode 5", "Episode 6", "Episode 7", "Episode 8", "Episode 9", "Episode 10", "Episode 11")

#this is a sanity check to make sure the column headers changed
head(rawdata, n = 1)
## # A tibble: 1 × 12
##   Season `Episode 1` `Episode 2` `Episode 3` `Episode 4` `Episode 5` `Episode 6`
##   <chr>        <dbl>       <dbl>       <dbl>       <dbl>       <dbl>       <dbl>
## 1 s1             8.9         8.6         8.5         8.6           9         9.1
## # ℹ 5 more variables: `Episode 7` <dbl>, `Episode 8` <dbl>, `Episode 9` <dbl>,
## #   `Episode 10` <dbl>, `Episode 11` <dbl>
#change the values in the first column
#removing the leading 's' just to clean up the view of the table
rawdata$Season <- sub("^s", "", rawdata$Season)

#render the table with kable
kable(rawdata, format = "markdown")
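| Season | Episode 1 | Episode 2 | Episode 3 | Episode 4 | Episode 5 | Episode 6 | Episode 7 | Episode 8 | Episode 9 | Episode 10 | Episode 11 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 8.9 | 8.6 | 8.5 | 8.6 | 9.0 | 9.1 | 9.1 | 8.9 | 9.6 | 9.4 | 9.2 |
| 2 | 8.6 | 8.3 | 8.7 | 8.6 | 8.6 | 8.9 | 8.8 | 8.6 | 9.7 | 9.3 | NA |
| 3 | 8.6 | 8.4 | 8.7 | 9.5 | 8.9 | 8.7 | 8.6 | 8.9 | 9.9 | 9.1 | NA |
| 4 | 9.0 | 9.7 | 8.7 | 8.7 | 8.6 | 9.7 | 9.0 | 9.7 | 9.6 | 9.7 | NA |
| 5 | 8.3 | 8.3 | 8.3 | 8.5 | 8.5 | 7.9 | 8.8 | 9.8 | 9.4 | 9.1 | NA |
| 6 | 8.4 | 9.2 | 8.6 | 9.0 | 9.7 | 8.3 | 8.5 | 8.3 | 9.9 | 9.9 | NA |
| 7 | 8.5 | 8.8 | 9.1 | 9.7 | 8.7 | 9.0 | 9.4 | NA | NA | NA | NA |
| 8 | 7.6 | 7.9 | 7.5 | 5.5 | 5.9 | 4.0 | NA | NA | NA | NA | NA |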

The above table shows a much cleaner version of the data; however, it is not ready for visualisation yet. Before plotting, I am going to remove the data for ‘Episode 11’, which only has a value for Season 1. This is the unaired original pilot: audiences never saw it, and it was simply an alternate to the official pilot episode that was released. It is not required in the analysis, so ‘Episode 11’ is excluded from the final dataset.

4.2 Reshaping the data

# Reshape the data to long format for a more flexible structure for visualizing, analyzing, and modeling data. This is easier for ggplot2 to handle.
rawdata_long <- rawdata %>%
  pivot_longer(cols = starts_with("Episode"),  # Select columns that start with "Episode"
               names_to = "Episode",           # Create a new column "Episode"
               values_to = "Rating")          # Create a new column "Rating"

# Exclude Episode 11 (the unaired pilot) from the data
data <- rawdata_long %>%
  filter(Episode != "Episode 11")

#this is a sanity check to make sure the data is now in a long format
head(data)
## # A tibble: 6 × 3
##   Season Episode   Rating
##   <chr>  <chr>      <dbl>
## 1 1      Episode 1    8.9
## 2 1      Episode 2    8.6
## 3 1      Episode 3    8.5
## 4 1      Episode 4    8.6
## 5 1      Episode 5    9  
## 6 1      Episode 6    9.1
# Save the data as a CSV file, which can be opened in Excel.
write.csv(data, "cleaned_data/data.csv", row.names = FALSE)
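
Episode 11 could equally be dropped before reshaping rather than filtered out afterwards. The sketch below is one possible approach, not the method used in this analysis: it uses a two-season stand-in table (the object `ratings_wide` is illustrative) and removes the column with `select()` before calling `pivot_longer()`.

```r
library(dplyr)
library(tidyr)

# Illustrative stand-in for the renamed wide table (two seasons only)
ratings_wide <- tibble(
  Season = c("1", "2"),
  `Episode 1` = c(8.9, 8.6),
  `Episode 2` = c(8.6, 8.3),
  `Episode 11` = c(9.2, NA)
)

# Drop the unaired pilot column, then reshape to one row per episode
ratings_long <- ratings_wide %>%
  select(-`Episode 11`) %>%
  pivot_longer(cols = starts_with("Episode"),
               names_to = "Episode",
               values_to = "Rating")

print(ratings_long)
```

Both orderings give the same long table; dropping the column first simply avoids creating rows that are removed straight away.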

5 Visualisations

5.1 Basic Visualisation

#Create a basic line plot with minimal customisation
p <- ggplot(data, aes(x = as.integer(str_replace(Episode, "Episode ", "")),  # Convert episode to numeric
                         y = Rating, 
                         color = factor(Season))) +  # Use Season for different lines
  geom_line() +                        # Draw lines
  geom_point() +                       # Add points for each episode
  labs(x = "Episode Number",            # Label for x-axis
       y = "Episode Rating",            # Label for y-axis
       color = "Season Number") +       # Label for the legend
  theme_minimal() +                    # Use a minimal theme for a clean look
  scale_color_viridis_d() +             # Add color scale for different lines
  theme(legend.position = "right")      # Place the legend at the right

#view the plot as a sanity check to assess what direction to take the customisations.
print(p)

6 Customisation

The x axis has defaulted to breaks at intervals of 2.5, whereas it should show whole episode numbers. As a first step, I ‘mutate’ the data to strip “Episode ” from each label and convert the remaining number (e.g., “1”, “2”) into an integer, creating a new column called ‘EpisodeNumber’. The axis breaks themselves are then set in the customised plot in section 7.1.

# Preprocess the Episode column
data1 <- data %>%
  mutate(EpisodeNumber = as.integer(str_replace(Episode, "Episode ", "")))

# This is a sanity check to view the new column
print(data1)
## # A tibble: 80 × 4
##    Season Episode    Rating EpisodeNumber
##    <chr>  <chr>       <dbl>         <int>
##  1 1      Episode 1     8.9             1
##  2 1      Episode 2     8.6             2
##  3 1      Episode 3     8.5             3
##  4 1      Episode 4     8.6             4
##  5 1      Episode 5     9               5
##  6 1      Episode 6     9.1             6
##  7 1      Episode 7     9.1             7
##  8 1      Episode 8     8.9             8
##  9 1      Episode 9     9.6             9
## 10 1      Episode 10    9.4            10
## # ℹ 70 more rows
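
The same column could also be derived with `readr::parse_number()`, which extracts the first number it finds in a string. This is a hedged sketch on a small illustrative tibble (`episodes` is a made-up object), not the code used in this analysis.

```r
library(dplyr)
library(readr)

# Illustrative episode labels in the same format as the Episode column
episodes <- tibble(Episode = c("Episode 1", "Episode 2", "Episode 10"))

# parse_number() returns a double, so convert it to an integer
episodes <- episodes %>%
  mutate(EpisodeNumber = as.integer(parse_number(Episode)))

print(episodes)
```

This avoids depending on the exact “Episode ” prefix, at the cost of being less explicit about what is removed.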

7 Colour

Next, I wanted to add a more personal customisation to the colours on the visualisation. To do this, I assigned a family house sigil to each of the seasons based on major plot points:

| Season Number | House | Major Plot Point |
|---|---|---|
| Season 1 | Stark | Only time all the Starks are together & Death of Eddard Stark |
| Season 2 | Baratheon | Death of Renly Baratheon & Battle of Blackwater |
| Season 3 | Lannister | Jaime Lannister loses his hand & The Red Wedding |
| Season 4 | Martell | Death of Oberyn Martell |
| Season 5 | Tyrell | Margaery Tyrell manipulates King’s Landing |
| Season 6 | Arryn | Saved the day at the Battle of the Bastards |
| Season 7 | Greyjoy | Yara Greyjoy declares herself Queen of the Iron Islands |
| Season 8 | Targaryen | Daenerys Targaryen gets the Iron Throne |

# Convert the numeric 'Season' column to a factor with appropriate labels
data1$Season <- factor(data1$Season, 
                       levels = 1:8, 
                       labels = c("Season 1", "Season 2", "Season 3", "Season 4", "Season 5", "Season 6", "Season 7", "Season 8"))

# Assign custom colors to each line based on the season
custom_colors <- c(
  "Season 1" = "#7f7f7f",   # Grey for Season 1, House Stark
  "Season 2" = "#ffc406",   # Yellow for Season 2, House Baratheon
  "Season 3" = "#B03060",   # Maroon for Season 3, House Lannister
  "Season 4" = "#ED7014",   # Orange for Season 4, House Martell
  "Season 5" = "#006400",   # Green for Season 5, House Tyrell
  "Season 6" = "#023E8A",   # Blue for Season 6, House Arryn
  "Season 7" = "#000000",   # Black for Season 7, House Greyjoy
  "Season 8" = "#ff0000"    # Red for Season 8, House Targaryen
)

Season 1 : #7f7f7f

Season 2 : #ffc406

Season 3 : #B03060

Season 4 : #ED7014

Season 5 : #006400

Season 6 : #023E8A

Season 7 : #000000

Season 8 : #ff0000

7.1 Customised Visualisation

# Create the plot with new customisations
p1 <- ggplot(data1, aes(x = EpisodeNumber, y = Rating, color = factor(Season))) +  
  geom_line() +                        
  geom_point() +                       
  labs(x = "Episode Number",            
       y = "Episode Rating",            
       color = "",                     # Leave the legend title blank
       caption = "Source: IMDB.com") +  # Add source text at the bottom
  ggtitle("Game of Thrones Episode Ratings Per Season") + # Add a title
  theme_minimal() +  # Clean, minimal theme
  scale_color_manual(values = custom_colors) +  # Apply custom colors for lines
  scale_x_continuous(breaks = seq(1, max(data1$EpisodeNumber), by = 1)) +  # Set x-axis breaks
  scale_y_continuous(
    breaks = seq(4, 10, by = 0.5),  # Set y-axis breaks
    limits = c(4, 10),              # Set y-axis limits
    expand = c(0, 0)                # Remove extra padding
  ) + 
  theme(legend.position = "right") +
  guides(color = guide_legend(
    keywidth = 2,  # Adjust the size of the legend key (box around the color circle)
    keyheight = 2, # Adjust the size of the legend key (box around the color circle)
    override.aes = list(size = 5)  # Increase the size of the color circles inside the legend
  ))

# Display the plot
print(p1)

8 Font

#load font
font_add_google("Merriweather", "Merriweather", regular.wt = 400)
showtext_auto()

p1_font <- p1 +
  theme(
    plot.title = element_text(family = "Merriweather", size = 36, face = "bold"),  # Change title font
    axis.title = element_text(family = "Merriweather", size = 26),  # Change axis title font
    axis.text = element_text(family = "Merriweather", size = 20),   # Change axis text font
    plot.caption = element_text(family = "Merriweather", size = 20)  # Change caption font
  )

print(p1_font)
## Warning: Removed 7 rows containing missing values or values outside the scale range
## (`geom_line()`).
## Warning: Removed 7 rows containing missing values or values outside the scale range
## (`geom_point()`).
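
The two warnings are consistent with the 7 missing ratings in the table: Seasons 7 and 8 have fewer than 10 episodes, so their remaining cells are NA. A minimal sketch, assuming the long-format layout used above (the object `late_seasons` is a stand-in built from the Season 7 and 8 rows), shows how those rows could be counted and dropped before plotting.

```r
library(dplyr)

# Stand-in for the Season 7 and 8 rows of the long-format data
late_seasons <- tibble(
  Season = rep(c("Season 7", "Season 8"), each = 10),
  EpisodeNumber = rep(1:10, times = 2),
  Rating = c(8.5, 8.8, 9.1, 9.7, 8.7, 9.0, 9.4, NA, NA, NA,
             7.6, 7.9, 7.5, 5.5, 5.9, 4.0, NA, NA, NA, NA)
)

# Count the missing ratings, then keep only episodes that have a rating
n_missing <- sum(is.na(late_seasons$Rating))
plot_data <- late_seasons %>%
  filter(!is.na(Rating))

print(n_missing)
print(nrow(plot_data))
```

Plotting from a filtered table like `plot_data` should avoid the warnings, since no rows would need to be removed for missing ratings at draw time.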

9 Interactive Visualisation

interactive_plot <- ggplotly(p1_font)

interactive_plot

10 Animated Visualisations

10.1 Animated Visualisation 1

anim <- p1_font + 
  geom_point() +
  transition_manual(EpisodeNumber, cumulative = TRUE) +
  labs(
    subtitle = "Episode: {frame}"  # Add a dynamic subtitle that changes with each frame
  )

anim

10.2 Animated Visualisation 2

anim2 <- p1_font + 
  geom_point() +
  transition_reveal(EpisodeNumber) +
  labs(
    subtitle = "Episode: {frame_along}"  # Dynamic subtitle showing the current position along the x axis
  )

anim2